1.  INTRODUCTION


In March 1978, Meteorology International Incorporated (MII) completed
a project [1] on behalf of Fleet Numerical Oceanography Center^1 (FNOC)
wherein over 16.5 million synoptic observations made by ships at sea were
blended with archived meteorological fields to produce re-analyzed sea-level
pressure fields^2 for the Northern Hemisphere on a 63x63 grid (polar
stereographic projection). These sea-level pressure fields, covering a
period of 30 years (1946-1975) at 6-hourly increments, were used to
diagnose sea-level wind fields for the same times. To check the
effectiveness and validity of the MII-developed diagnosis algorithm for use
with the re-analyzed pressure fields, a statistical analysis of diagnosed
wind speeds versus observed wind speeds was carried out. Two years
were selected (1964 and 1965, together representing over 2.3 million wind
reports from ships) and actual wind speeds were compared with wind
speeds diagnosed from the re-analyzed pressure fields using the wind
algorithm.^3 In making the comparison, of course, the diagnosed wind
speeds were for the same location as the corresponding reported wind
speed.^4 The results were shown in the form of a scatter diagram together
with a variety of derived statistics. It was found that there was no
overall bias of any significance; in other words, in the mean, observed
winds were in agreement with winds diagnosed from analyzed sea-level
pressure fields.

These results directly contradicted similar results produced by an
FNOC field verification capability known at that time as OPL2;^5 for
example, this program usually indicated that diagnosed (and analyzed)
winds resulting from applications of FIB methodology were too low when
compared with observed winds. Doubt was thrown on the OPL2 field
verification scheme. Since OPL2 was being used to "verify" and tune
a number of FNOC-produced fields of environmental parameters, both
analyzed and forecast, a careful review of the OPL2 program and its
results was indicated.

^1 At that time known as Fleet Numerical Weather Central (FNWC).

^2 The analysis system used was an application of the general-purpose
Fields by Information Blending (FIB) methodology.

^3 Previous investigations had determined that the MII wind algorithm
produced no systematic bias with regard to wind direction.

^4 The scatter diagram arising from the comparison is given later in
this Report; see Table 5.

^5 It should be noted that OPL2 was developed for FNOC by another
contractor.

This review was carried out by Holl [2]. By defining hypothetical
interrelated distributions of "observed" values, "field" values and "true"
values of an arbitrary environmental parameter, Holl clearly demonstrated
that verification of fields based on comparison with concurrent observations
was (and continues to be) subject to much misunderstanding and confusion.
As will be seen, the consequences of Holl's study are far-reaching
concerning the subject of field verification statistics and their
interpretation.

Having completed the review of OPL2 (which led to a study of field
verification statistics in general terms), MII was tasked with programming
and implementing a valid field verification capability. This capability,
now part of the FNOC system, is known as OPL2X. The current version
of OPL2X originally was developed as MII Project Number M-240 [3] and
later refined under MII Project Number M-245-08.^1

It may be noted that even OPL2X results can be misinterpreted if the
User has insufficient background knowledge and understanding; conclusions
which seem intuitively obvious are often invalid when examined in the
light of Holl's study [2]. This Report, produced as part of MII Project
Number M-245-08, is intended to provide the User with the necessary
insight into the subject of field verification statistics. It gathers together
the essential elements of Holl's study (Sections 2 through 5), and describes
the capabilities and output of the current version of the OPL2X program
for the verification of fields based on comparison with concurrent
observations.


^1 Service Order Number QE-08 performed under Contract Number
N00228-79-D-9689.


2.  BASIC  CONSIDERATIONS


2.1  Introduction 

The  subject  fields  which  are  to  be  verified  may  have  been  produced 
by  an  analysis  capability,  or  by  a  parameter  diagnosis  algorithm,  or  by  a 
prediction  capability.  By  analysis  it  is  generally  implied  that  the  production 
process  includes  assimilation  of  observations  which  are  concurrent  with  the 
applicable  time  range,  or  time  distribution,  of  the  field.  The  term  diagnosis 
generally  implies  that  no  concurrent  observations  of  the  object  parameter 
have  been  directly  assimilated  in  the  production  of  the  field.  An  example 
is  a  geostrophic  wind  field  diagnosed  entirely  from  the  concurrent  surface 
pressure field. A predicted field generally^1 implies that no concurrent
observations of any kind have been assimilated in the production.

A  field  may  be  that  of  a  scalar  parameter  such  as  significant  wave 
height,  or  a  vector  parameter  such  as  a  surface  wind  field.  The  horizontal 
wind  velocity  may  be  prescribed  in  terms  of  two  orthogonal  components  which 
can  be  transformed  into  speed  and  direction. 

A  set  of  observations  concurrent  with  a  field  (however  produced), 
paired  with  the  set  of  values  specified  by  the  field  at  the  corresponding 
locations,  forms  a  bivariate  distribution.  This  data  set  is  to  be  exploited 
for  verifying  the  field  and  for  evaluating  the  capability  which  produced  it. 

The subject fields which are routinely verified by FNOC include
marine wind speed and direction, and significant wave height. The present
study is developed here using the wind speed as the primary example.
However, the applicability extends to other fields as well.

It is common to calculate the mean difference and the Root-Mean-Square
(RMS) difference between field and observed values as a function
of the observed value. That is, the bivariate data set is stratified
according to ranges of observed value. These quantities often are
interpreted as the average error and the RMS error for each range.
These labels are wrong. The implications are misleading. We shall show
that these quantities have little direct bearing on field verification.


^1 The terms "semi-predicted fields" and "diagnostic-cycle routine"
have been used to identify fields of a parameter which are produced with
the aid of information in analyzed scenarios of other parameters.


2.2  The  Objective  of  a  Field 

Fields  are  numerically  represented  by  a  finite  set  of  rounded  values, 
together  with  an  interpretation  scheme  for  defining  the  field  continuum  from 
these  values.  Two  types  are  common: 

(1)  An array of grid-point values, with grid points referring to
uniformly spaced positions over the region of the field,
together with a field-interpolation formula or formulas.^1

(2)  A  set  of  values  which  are  interpreted  as  the  combination 
coefficients  of  a  finite  set  of  component  fields  defined  as 
being  double  Fourier  Functions  of  a  discrete  set  of 
wavelengths,  or  as  some  other  specified  set  of  functionally- 
defined  elemental  fields. 

In the present study we will generally have the first type in mind.
However, the applicability extends to other types as well.

The specification of the field interpretation scheme is often overlooked
in the case of grid-point representation of a field; it is left to the User to
design his own scheme. This license is bad. Not only may different
values be extracted for the same locations but also consistencies between
the object field parameter and other fields may be violated in the process.
The arbitrary interpolation of ocean parameters should not be allowed at
all because continuity in places is interrupted between adjacent grid points.
An example of destroying consistencies is the destabilization of static
stability in arbitrary interpolation of atmospheric mass-structure parameters.


^1 Fields of ocean parameters, such as Sea-Surface Temperature, produced
by the FIB analysis capability are interpreted by one scheme of grid
interpolation based on zero-and-first-order continuity in open-sea areas,
and by another scheme of grid interpolation in coastal regions where
continuity is interrupted by land.


The number of values in the finite set, and their rounding, lower-bound
the range of scale which may be defined by a numerically-expressed
field. A specified time span defines the lower bound of the range of scale
in time. The field objective is to specify grid-point values which are
representative in that lower-bounded range of scale. A perfect
numerically-expressed field defines what we may call the "true" values at
the grid points or at any other interpolated location in the field.^1

What is meant by a representative value in the case, say, of a wind
field (a horizontal vector, $\mathbf{V}$, of two components)? One often
finds this question to be neglected, and field production capabilities are
developed without such consideration.

In the context of atmospheric dynamic significance, and operational
significance, the usual tacit objective is grid-point values of the wind
which are representative of momentum and mass fluxes, usually ignoring
the smaller-order horizontal variations in air density. In this context the
representative values are those which average-out the subscale. Where
several wind observations are available from a very small region relative
to the grid array, the average of these winds is the best contributing
estimate of the locally representative wind.

There  are  other  objectives  in  analyzing  a  wind  field.  Consider  the 
context  of  ocean  wave  generation  by  wind  stress.  The  emphasis  is  on  field 
values  which  are  most  representative  of  the  effective  wind  stress  in  the 
lower-bounded range of scale afforded by the grid-point array. In this
context it is more germane to analyze the field in terms of $V\mathbf{V}$
as object parameter, in components $Vu$ and $Vv$, where $V$ is the wind
speed and $u$ and $v$ are the velocity components.

^1 It is very necessary to this discussion to appreciate the differences
between an observed value, a field value, a true value and a representative
value; in fact many misinterpretations of field verification statistics directly
arise from confusing one of these quantities with another. A detailed
discussion is provided in Reference [4].


In combining wind observations in close proximity the combined
estimate differs in the two defined contexts. For example, consider three
parallel winds of different speeds, say, 10, 11 and 15 units. The
momentum-mean wind speed is 12, and the stress-mean wind speed is the
RMS value, closer to 12.2. Additional allowance for gusts, subscale in
time to anemometer readings,^1 is also in order in the context of wind
stress.
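
The arithmetic of the three-wind example can be checked with a short sketch. The function names are illustrative only and do not come from the Report; the winds are taken as parallel, so only the speeds matter.

```python
import math


def momentum_mean_speed(speeds):
    """Arithmetic mean of parallel wind speeds (momentum-flux context)."""
    return sum(speeds) / len(speeds)


def stress_mean_speed(speeds):
    """RMS of parallel wind speeds (wind-stress context, stress ~ V * V)."""
    return math.sqrt(sum(v * v for v in speeds) / len(speeds))


speeds = [10, 11, 15]
print(momentum_mean_speed(speeds))          # 12.0
print(round(stress_mean_speed(speeds), 2))  # 12.19
```

The stress-mean always equals or exceeds the momentum-mean, so the choice of object parameter changes the value that a cluster of observations contributes to the field.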

For simplicity we shall proceed with the momentum-wind context,
with $\mathbf{V}$ as object parameter. However, the study applies equally
to the stress-wind context, with $V\mathbf{V}$ as object parameter.

It  should  be  clearly  understood  that  a  wind  observation  at  a  station 
location  is  not  the  same  as  the  field-objective  true  wind  at  that  location. 
Contributing  variances  in  observations  relative  to  the  true  wind  include: 

(1)  Observational  errors  such  as  instrument  errors  and/or  errors 
in  subjective  estimates. 

(2)  Subscale  variance.  This  includes  the  subscale  in  both  space 
and  time. 

(3)  Gross  errors  such  as  errors  introduced  in  transcribing  and 
transmitting  observations  and  their  position. 

We  are  primarily  concerned  with  the  second  of  these  contributions. 

It  should  also  be  appreciated  that  in  attempting  to  verify  a  field  by 
comparing  concurrent  observations  with  corresponding  field  values,  the  true 
value  corresponding  to  each  pair  is  not  known.  The  produced  field  values 
are  themselves  estimates  of  the  field-objective  true  values. 


2.3  A  Bivariate  Frequency  Distribution 

Consider  the  idealized  frequency  distribution  of  rounded-value  pairs 
given  by  Table  1.  This  sample  would  apply,  for  example,  to  the  case  in 
which  the  observed  and  the  field  values  have  similarly  distributed  variances 
relative  to  the  true  values,  all  without  bias. 


^1 Wind observations from ships at sea may be based on anemometer
readings and/or on subjective estimates based on the character of the
wind-blown sea. A skilled observer converts the appearance of the sea into a
corresponding, effective stress-mean wind.
Each element of the array gives the number of times, $N_{F,B}$, that
the field category, F, occurs paired with the observed category, B. The
true value, T, corresponding to each F,B pair is not included. Inclusion
of T would require a third dimension for the frequency table of entries
$N_{F,B,T}$.

The total number of times that the observed value, B, occurs is
given by the column totals, $N_B = \sum_F N_{F,B}$. Similarly the row
totals give $N_F = \sum_B N_{F,B}$. The distribution is symmetric about
the diagonal. The frequencies favor the lower categories, with the value 2
appearing most often.

We  will  make  use  of  such  frequency  tables  in  several  demonstrations. 

To  each  observed  category,  B,  there  corresponds  a  frequency 
distribution  of  field  values,  F.  Similarly,  a  frequency  distribution  of 
observed  values,  B,  corresponds  to  each  field  category,  F. 

The mean field value, $\bar{F}_B$, corresponding to an observed value of B
is greater than B in the categories B = 0 and 1, equal in 2, and is less
than B in categories B = 3 through 9. A verification statistics package
based entirely on such stratification may give the misleading label of
"field error as a function of observed value" to the difference between
$\bar{F}_B$ and B. The implication here is that the field values are too
small in the range 3 through 9.

Similar reasoning leads to the opposite conclusion if stratification is
based on field value. (Considering that the fields are the products
subject to user interpretations and applications this approach seems more
appropriate.) The mean observed value, $\bar{B}_F$, corresponding to a field
value of F is greater than F in the categories F = 0 and 1, and is less
than F in categories F = 3 through 9. The implication now is that the
field values are too large^1 in the range 3 through 9!


^1 The original OPL2 package stratified field values by observed
categories. Needless to say, in general it was concluded that field values
were too low. Based on this conclusion some numerical models were tuned
to remove the non-existent "bias". Had stratification been based on field
categories, tuning in the opposite sense would have been required to
remove the bias. Both "biases" would have arisen from the same input
data.
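
The two opposite "biases" can be reproduced with a small sketch. The true-value frequencies and the error set below are hypothetical, chosen only to give a symmetric table skewed toward low values; they are not the entries of Table 1.

```python
from collections import Counter
from itertools import product

# Hypothetical inputs, chosen only for illustration (not from the Report).
TRUE_COUNTS = {1: 9, 2: 11, 3: 10, 4: 8, 5: 4, 6: 2, 7: 1}
ERRORS = (-1, 0, 1)  # equally likely, unbiased, same for F and B


def symmetric_table():
    """Return N[(F, B)] when F and B scatter identically about T."""
    n = Counter()
    for t, count in TRUE_COUNTS.items():
        for err_f, err_b in product(ERRORS, ERRORS):
            n[(t + err_f, t + err_b)] += count
    return n


def conditional_mean(n, given):
    """Mean of the other variable per category of the stratifying one.

    given=1 stratifies by B and returns the mean F for each B;
    given=0 stratifies by F and returns the mean B for each F.
    """
    other = 1 - given
    total, weighted = Counter(), Counter()
    for pair, count in n.items():
        total[pair[given]] += count
        weighted[pair[given]] += count * pair[other]
    return {k: weighted[k] / total[k] for k in sorted(total)}


table = symmetric_table()
mean_f_by_b = conditional_mean(table, given=1)
mean_b_by_f = conditional_mean(table, given=0)
for k in mean_f_by_b:
    # category, (mean F - B) when stratified by B, (mean B - F) when by F
    print(k, round(mean_f_by_b[k] - k, 2), round(mean_b_by_f[k] - k, 2))
```

Because the table is symmetric, the two printed columns are identical: stratifying by B suggests the field is too low at the high end, while stratifying by F suggests the observations are too low there, from the same data.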
Table 1

An idealized symmetric bivariate frequency distribution.

(Rows: Field Value, F, with Row Total; columns: Observed Value, B, with
Column Total. The individual array entries are not legible in this copy.)

Observed Value, B:       0     1     2     3     4     5     6     7     8     9
Column Total, $N_B$:    10    33    41    37    27    17     9     8     4     2

Mean values for each class are as follows:

Class:                   0     1     2     3     4     5     6     7     8     9
$\bar{F}_B$:           0.8   1.4   2.0   2.8   3.8   4.8   5.8   6.8   7.8   8.5
$\bar{B}_F$:           0.8   1.4   2.0   2.8   3.8   4.8   5.8   6.8   7.8   8.5
We shall demonstrate that neither approach can be interpreted to
imply that the field values have been produced with a bias relative to the
field-objective true values.

Note that, for the frequency distribution given by Table 1, the
overall mean of the difference between observed and field values is zero.
That is,

$$\sum_{F,B} N_{F,B}\,(F - B) = 0 \tag{1}$$

In fact it follows from the symmetry of the distribution that all odd
moments about the diagonal (the one-to-one relationship) are zero, and the
frequency distribution of field values, $N_F$, is identical to that of
observed values, $N_B$.
3.  CONCEPTS  AND  DEMONSTRATIONS

3.1  A  Perfect  Field 

To  explore  the  properties  of  bivariate  samples  we  begin  with  an 
idealized  perfect  field.  In  this  example  we  give  the  object  parameter  a 
preponderance  of  lower  values  over  higher  values.  This  is  characteristic 
of  such  fields  as  wind  speed  and  wave  height. 

Consider  that  a  large  number  of  observations  are  clustered  around 
individual  field  values.  The  field  value  of  a  perfect  analysis  is  the  local 
mean  of  these  observations.  Our  example  is  simplified  to  seven 
observations  per  cluster,  giving  one  field  value  at  that  location  in  the 
object  range  of  scale.  (The  seven  observations  can  be  increased  to 
whatever  it  takes  to  create  confidence  in  the  corresponding  field  value.) 
Observational  values  have  been  chosen  to  give  integer  mean  values.  In 
order  to  achieve  the  desired  preponderance  of  lower  values,  each  pair 
arising  from  a  combination  of  a  cluster  of  observations  with  a  corresponding 
field  value  occurs  a  specified  number  of  times  in  the  field. 

The sample consists of the following:


B Values in Cluster     No. of Such     Corresponding            No. of
                        Clusters        F Value (= $\bar{B}$)    Pairs

6, 6, 6, 7, 7, 8, 9          1                 7                     7
5, 5, 5, 6, 6, 7, 8          2                 6                    14
4, 4, 4, 5, 5, 6, 7          4                 5                    28
3, 3, 3, 4, 4, 5, 6          8                 4                    56
2, 2, 2, 3, 3, 4, 5         10                 3                    70
1, 1, 1, 2, 2, 3, 4         11                 2                    77
0, 0, 0, 1, 1, 2, 3          9                 1                    63

Sample Total:                                                      315 Pairs

The corresponding bivariate frequency distribution of 315 pairs is tabulated
in Table 2. In this idealized example of a perfect field, field values
achieve the desired field-objective true values, i.e., $F = \bar{B} = T$.
What else can we learn from Table 2?
Table 2

Frequency distribution for an idealized perfect field. Entries in
parentheses lie on the diagonal, F = B.


  $N_F$      F
      0      9
      0      8
      7      7                                         3   (2)     1     1
     14      6                                   6   (4)     2     2
     28      5                            12   (8)     4     4
     56      4                      24  (16)     8     8
     70      3                30  (20)    10    10
     77      2          33  (22)    11    11
     63      1    27  (18)     9     9
      0      0

            B:     0     1     2     3     4     5     6     7     8     9
        $N_B$:    27    51    61    64    49    32    19     8     3     1
  $\bar{F}_B$:   1.0   1.6   2.3   2.9   3.6   4.3   5.1   5.8   6.3   7.0


The mean field value, $\bar{F}_B$, corresponding to an observed value of B
is greater than B in the categories B = 0, 1 and 2, and is less than B in
categories B = 3 through 9. Based on this stratification by categories of
B, are the field values really biased? Clearly not; we started with a
perfect and unbiased field! Stratification of the sample into ranges of B
is misleading.

We also note that the frequency distributions $N_F$ and $N_B$ do not
match, even for this perfect field. Hence such comparison is not a good
indicator for field verification.

The overall (i.e., unstratified) difference between pairs is zero:

$$\sum_{F,B} N_{F,B}\,(F - B) = 0 \tag{2}$$

The mean squared difference, taken over all N = 315 pairs, is non-zero:

$$\frac{1}{N} \sum_{F,B} N_{F,B}\,(F - B)^2 = 1.143 \tag{3}$$

This is the subscale variance in the observations that the field, which is
lower bounded in scale, cannot accommodate.
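
The statistics of the perfect-field sample can be recomputed directly from the cluster definitions of Section 3.1. The sketch below is an illustration only; the variable names are not from the Report, and it relies on the fact that every listed cluster has the same pattern of deviations about its mean.

```python
from collections import Counter

# Each cluster in the Section 3.1 sample deviates from its mean F by
# the same pattern, e.g. 6,6,6,7,7,8,9 about 7.
CLUSTER_OFFSETS = (-1, -1, -1, 0, 0, 1, 2)
CLUSTERS_PER_F = {7: 1, 6: 2, 5: 4, 4: 8, 3: 10, 2: 11, 1: 9}


def perfect_field_table():
    """Return N[(F, B)] for the idealized perfect field."""
    n = Counter()
    for f, clusters in CLUSTERS_PER_F.items():
        for offset in CLUSTER_OFFSETS:
            n[(f, f + offset)] += clusters
    return n


n = perfect_field_table()
total = sum(n.values())

# Stratify by observed value B: column totals and mean F per column.
n_b, f_sum = Counter(), Counter()
for (f, b), count in n.items():
    n_b[b] += count
    f_sum[b] += count * f
mean_f_given_b = {b: f_sum[b] / n_b[b] for b in sorted(n_b)}

# Unstratified statistics corresponding to Eqs. (2) and (3).
mean_diff = sum(c * (f - b) for (f, b), c in n.items()) / total
mean_sq_diff = sum(c * (f - b) ** 2 for (f, b), c in n.items()) / total

print(total)                                # 315
print([n_b[b] for b in range(10)])
print({b: round(v, 1) for b, v in mean_f_given_b.items()})
print(mean_diff, round(mean_sq_diff, 3))    # 0.0 1.143
```

The stratified means drift away from the diagonal even though every field value equals the mean of its own cluster, which is the point of the demonstration.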


3.2  An  Imperfect  Field  Without  Bias 

For  our  second  fabrication  we  modify  the  preceding  example  of  a 
perfect  field  to  include  variance  in  the  field  values,  F,  relative  to  the 
field-objective  true  values,  T.  Each  cluster  of  observations  is  specified 
to  occur  four  times  as  often  as  in  the  preceding  example,  with  each  of  the 
four  subsets  assuming  a  field  value  spread  about  the  true  value,  spread 
symmetrically  without  bias. 
The sample consists of the following:


B Values in Cluster     No. of Such     T Value     F Values       No. of
                        Clusters                                   Pairs

6, 6, 6, 7, 7, 8, 9       4 x 1            7        6, 7, 7, 8        28
5, 5, 5, 6, 6, 7, 8       4 x 2            6        5, 6, 6, 7        56
4, 4, 4, 5, 5, 6, 7       4 x 4            5        4, 5, 5, 6       112
3, 3, 3, 4, 4, 5, 6       4 x 8            4        3, 4, 4, 5       224
2, 2, 2, 3, 3, 4, 5       4 x 10           3        2, 3, 3, 4       280
1, 1, 1, 2, 2, 3, 4       4 x 11           2        1, 2, 2, 3       308
0, 0, 0, 1, 1, 2, 3       4 x 9            1        0, 1, 1, 2       252

Sample Total:                                                     1,260 Pairs

We have constructed a trivariate distribution: trios consisting of an
observation, a corresponding field value, and a corresponding true value.^1
The bivariate distribution of observed-and-field pairs is tabulated in Table 3.
The bivariate distribution of field-and-true pairs is tabulated in Table 4.

According to Table 3, the mean observed value, $\bar{B}_F$, corresponding
to a field value, F, is greater than F in the categories F = 0 and 1, and is
less than F in categories F = 3 through 8. Table 4 shows that $\bar{T}_F$
matches $\bar{B}_F$ of Table 3. These resultants cannot be denied; overall,
$\bar{B}_F$ is a better approximation of T. Does this imply that the
capability which produced the field is biased? Can the field be improved by
rescaling F to match $\bar{B}_F$? The answer to both questions is no.
Contrary to intuitive reasoning such results do not imply a biased field.

The production capability produced an unbiased distribution of F in
each category of true value, T; i.e., $\bar{F}_T = T$. The bias in
$\bar{T}_F$ versus F is a consequence of the frequency distribution of
values in the sample. For example, there are 56 cases of the true value 4
which are interpreted by the field as 5's but only 14 cases of the true
value 6 are interpreted by the field as 5's. Stratification by field values
brings in an unbalanced population of true values.
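
The stratified means discussed above can be recomputed from the sample definition of Section 3.2. The following is a sketch only; the helper names and the index convention (0 = F, 1 = B, 2 = T) are assumptions made for this illustration.

```python
from collections import Counter
from itertools import product

CLUSTER_OFFSETS = (-1, -1, -1, 0, 0, 1, 2)   # B minus T within a cluster
FIELD_OFFSETS = (-1, 0, 0, 1)                # F minus T over the four subsets
CLUSTERS_PER_T = {7: 1, 6: 2, 5: 4, 4: 8, 3: 10, 2: 11, 1: 9}


def trios():
    """Return N[(F, B, T)] for the unbiased imperfect-field sample."""
    n = Counter()
    for t, clusters in CLUSTERS_PER_T.items():
        for df, db in product(FIELD_OFFSETS, CLUSTER_OFFSETS):
            n[(t + df, t + db, t)] += clusters
    return n


def stratified_mean(n, by, of):
    """Mean of component `of` within each category of component `by`.

    Components are indexed 0 = F, 1 = B, 2 = T.
    """
    total, weighted = Counter(), Counter()
    for trio, count in n.items():
        total[trio[by]] += count
        weighted[trio[by]] += count * trio[of]
    return {k: weighted[k] / total[k] for k in sorted(total)}


n = trios()
b_given_f = stratified_mean(n, by=0, of=1)   # mean observed per field value
t_given_f = stratified_mean(n, by=0, of=2)   # mean true per field value
f_given_t = stratified_mean(n, by=2, of=0)   # mean field per true value

# Number of pairs with field value 5 coming from true values 4 and 6.
from_4 = sum(c for (f, b, t), c in n.items() if f == 5 and t == 4)
from_6 = sum(c for (f, b, t), c in n.items() if f == 5 and t == 6)
print(from_4, from_6)                        # 56 14
print({f: round(v, 2) for f, v in t_given_f.items()})
print(f_given_t)
```

Stratifying by T returns the true value itself in every category, while stratifying by F does not, because each field category draws unequally from the neighboring true categories.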


^1 This trio is a sample of what occurs in practice; unfortunately the
true value is not known. Many field verification schemes assume that
observations and true values are one and the same. This entirely invalid
assumption produces equally invalid "verification statistics".
Figure 1 is a comparison of frequency distributions as function of
the value range. Rescaling F to match $\bar{T}_F$ would diminish features
in the field continuum. Rescaling would also affect other significant
aspects of the field continuum; e.g., gradient would be reduced everywhere.

What other properties does our second bivariate sample have? The
overall, i.e., unstratified, difference between pairs is zero:

$$\sum_{F,B} N_{F,B}\,(F - B) = 0 \tag{4}$$

The mean squared difference, taken over all N = 1,260 pairs, is non-zero:

$$\frac{1}{N} \sum_{F,B} N_{F,B}\,(F - B)^2 = 1.643 \tag{5}$$

It should be noted that this variance of field versus observed values is a
combination of two contributing, uncorrelated variances: the subscale
variance of observed versus true which was calculated in the preceding
sample to be 1.143, and the variance of field versus true which is readily
calculated to be 0.5. In equation form,

$$\frac{1}{N} \sum_{F,B} N_{F,B}\,(F - B)^2
  = \frac{1}{N} \sum_{B,T} N_{B,T}\,(B - T)^2
  + \frac{1}{N} \sum_{F,T} N_{F,T}\,(F - T)^2 \tag{6}$$

for which we can introduce the notation,

$$\sigma_{F,B}^2 = \sigma_{B,T}^2 + \sigma_{F,T}^2 \tag{7}$$
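
The additivity of the two variances can be verified exactly for the Section 3.2 sample. The sketch below uses rational arithmetic so that the equality is tested without rounding; the function name is illustrative and not from the Report.

```python
from fractions import Fraction
from itertools import product

CLUSTER_OFFSETS = (-1, -1, -1, 0, 0, 1, 2)   # B minus T within a cluster
FIELD_OFFSETS = (-1, 0, 0, 1)                # F minus T over the four subsets
CLUSTERS_PER_T = {7: 1, 6: 2, 5: 4, 4: 8, 3: 10, 2: 11, 1: 9}


def mean_squares():
    """Return the mean squares of (F - B), (B - T) and (F - T)."""
    n = fb = bt = ft = 0
    for t, clusters in CLUSTERS_PER_T.items():
        for df, db in product(FIELD_OFFSETS, CLUSTER_OFFSETS):
            f, b = t + df, t + db
            n += clusters
            fb += clusters * (f - b) ** 2
            bt += clusters * (b - t) ** 2
            ft += clusters * (f - t) ** 2
    return Fraction(fb, n), Fraction(bt, n), Fraction(ft, n)


var_fb, var_bt, var_ft = mean_squares()
print(float(var_fb), float(var_bt), float(var_ft))
print(var_fb == var_bt + var_ft)   # True
```

The cross term vanishes here because every field offset is paired with every observation offset within a cluster, so the two deviations from T are uncorrelated by construction.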
Figure 1  Comparison of frequency distributions. Relative to the other
distributions, $N_F$ versus $\bar{B}_F$ is diminished in the wings of the
value range.
3.3  Summary  and  Conclusions


Our first example, displayed by Table 2, clearly indicates that
stratification by observed value, B, is misleading. Our second example,
Tables 3 and 4, demonstrates that stratification by field value, F, is also
prone to misinterpretation and may lead to abuse of field values by
rescaling.

The only stratification which bears directly on the verification of a
field, and the evaluation of the capability which produced it, and which
gives meaningful resultants, is stratification according to ranges of true
value. But the field-objective true values are generally not available.

The  above  findings  do  not  negate  the  value  of  using  observations 
for  the  verification  of  fields.  Inappropriate  stratifications  have  been 
indicated  (and  indicted).  Observations  are  estimates  of  true.  A  cluster 
of  observations  will  tend  to  resolve  local  true  provided  they  are  sufficiently 
dispersed  in  the  space  and  time  ambience  to  be  representative  of  the  objective 
range  of  scale. 

Two  additional  considerations  should  be  kept  in  mind: 

(1)  Observations include variances in addition to the subscale
relative to true: instrument/estimation error and
transcription/transmission error.

(2)  Fields are (or should be) produced by exploitation of all
available relevant information. The blended resultant of
the available information can implicate some individual
observations to be grossly in error, or to be non-representative
of the objective true value.
4.  THE  VALUE  OF  SCATTER  DIAGRAMS  AND  FREQUENCY  TABLES

4.1  Inspection

A  two-dimensional  scatter  diagram  of  the  bivariate  sample  is  a  plot 
of  points  in  the  coordinate  plane  of  B  and  F,  with  each  point  representing 
a  B,F  pair  of  values.  Because  of  the  rounding  of  values  and/or  the 
limitation  in  the  smallness  of  a  plotted  point,  a  multiplicity  of  points  may 
coincide.  This  multiplicity  can  be  shown  by  plotting  the  multiplicity 
number  instead  of  a  point  at  the  location,  and  the  scatter  diagram  then 
becomes  a  frequency  table.  A  frequency  table  is,  in  effect,  a  coarse 
scatter diagram. We have shown several artificial examples. Table 5 is
a real example taken from a recent MII report [1].

Inspection  of  such  a  scatter  diagram  will  tell  if  anything  is  very 
wrong.  The  experienced  eye  may  be  able  to  pick  out  some  details. 

The one-to-one correspondence, the similarity in variances in observed
and diagnosed values, and the distribution of values peaking at about 4 or
5 meters/second, are fairly obvious in Table 5. At the low end the diagnosed
wind speeds are greater than observed, and vice versa at the high end.
There is also evidence of a subjective (i.e., observer) preference for the
value 15, and against the value 17.

Subscale  variance  affects  the  distribution  at  all  ranges.  It  is  an 
explanation  of  the  relationship  at  the  low  and  high  ends  of  Table  5. 

Stratification of the sample in any category of observed value above
5 meters/second would show the corresponding mean field value to be
smaller than the observed value, whereas stratification in any category of
field value above 5 meters/second would show the field value to be larger
than the corresponding mean observed value. Such stratification is
related to fitting a straight line to the data in the "regression" sense,
minimizing the mean square distance of all points from the line, measuring
distance along one coordinate or the other, depending on which is
considered to be the dependent variable.
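
The link between stratification and regression can be illustrated with the artificial sample of Section 3.2, used here as a stand-in because the entries of Table 5 are not reproduced. The sketch computes the two ordinary least-squares slopes, one with F and one with B as the dependent variable; the function names are assumptions made for this illustration.

```python
from itertools import product

CLUSTER_OFFSETS = (-1, -1, -1, 0, 0, 1, 2)   # B minus T within a cluster
FIELD_OFFSETS = (-1, 0, 0, 1)                # F minus T over the four subsets
CLUSTERS_PER_T = {7: 1, 6: 2, 5: 4, 4: 8, 3: 10, 2: 11, 1: 9}


def weighted_pairs():
    """Yield (F, B, weight) for the unbiased sample of Section 3.2."""
    for t, clusters in CLUSTERS_PER_T.items():
        for df, db in product(FIELD_OFFSETS, CLUSTER_OFFSETS):
            yield t + df, t + db, clusters


def regression_slopes(points):
    """Return (slope of F regressed on B, slope of B regressed on F)."""
    points = list(points)
    n = sum(w for _, _, w in points)
    mean_f = sum(w * f for f, _, w in points) / n
    mean_b = sum(w * b for _, b, w in points) / n
    cov = sum(w * (f - mean_f) * (b - mean_b) for f, b, w in points) / n
    var_f = sum(w * (f - mean_f) ** 2 for f, _, w in points) / n
    var_b = sum(w * (b - mean_b) ** 2 for _, b, w in points) / n
    return cov / var_b, cov / var_f


slope_f_on_b, slope_b_on_f = regression_slopes(weighted_pairs())
print(round(slope_f_on_b, 3), round(slope_b_on_f, 3))
```

Both slopes come out below one for this unbiased sample, so each regression, read on its own, would suggest that its dependent variable is too small at the high end.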


Table 5

Scatter diagram showing observed wind versus diagnosed wind
by class. Numbers on scatter diagram show the lower-bound
(in thousands) of occurrences in each class. Over 2.3 million
pairs of ship reports and corresponding diagnosed winds were
utilized in constructing this scatter diagram.

[Scatter diagram not reproduced; one axis is Diagnosed Wind Speed
(meters/second).]
4.2  The  One-to-One  Relationship


The overall relationship between observed values and field values
can be examined by fitting a best straight line to all the points in a scatter
diagram. The best straight line is defined as that which minimizes the mean
square distance of all the points from the line, measuring the shortest, i.e.,
line-normal, distances from points to line.

The line can be restricted to pass through the origin, B = F = 0.
If we define the slope of this line as $\tan\theta = \delta F/\delta B$, then
the best fit is given by

$$\tan 2\theta = [\text{right-hand side not legible in this copy}] \tag{8}$$

The formula is more involved if the restriction of passing through the
origin is lifted. In the case of Table 4, Eq. (8) gives $\tan\theta = 0.982$.

The best straight-line fit, however, does not give specific information
as to biases and variances.
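
A line-normal fit through the origin can be sketched with the standard closed form for orthogonal regression, $\tan 2\theta = 2\sum BF / (\sum B^2 - \sum F^2)$. This is an assumption: it is the textbook result for this minimization, and it has the same left-hand side as Eq. (8), but the surviving text does not allow confirming that it is the Report's exact expression. The function name and sample points are hypothetical.

```python
import math


def orthogonal_slope_through_origin(pairs):
    """Slope tan(theta) = dF/dB of the line through the origin that
    minimizes the sum of squared perpendicular distances.

    pairs: iterable of (b, f) values. A vertical best-fit line
    (theta = 90 degrees) is not handled specially.
    """
    s_bb = s_ff = s_bf = 0.0
    for b, f in pairs:
        s_bb += b * b
        s_ff += f * f
        s_bf += b * f
    # atan2 selects the branch of 2*theta that gives the minimum
    # (rather than the maximum) of the perpendicular misfit.
    theta = 0.5 * math.atan2(2.0 * s_bf, s_bb - s_ff)
    return math.tan(theta)


# Points lying exactly on F = 0.5 * B recover that slope.
print(round(orthogonal_slope_through_origin([(2, 1), (4, 2)]), 6))  # 0.5
```

Unlike the two regression lines, this fit treats B and F symmetrically: swapping the roles of the two variables gives the reciprocal slope.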


4.3  Diagnosis  Based  on  a  Model 


The frequency distribution, the table of $N_{F,B}$ counts, can be diagnosed
in terms of a model to obtain corresponding estimates of the frequency
distribution, $N_T$, the variances $\sigma_{B,T}^2$ and $\sigma_{F,T}^2$,
and the bias of F, in each category of true, T. We specify the model by the
equations

$B_T = T$
      + a random component normally distributed with a
        variance which is a function of the category T.          (9)

$F_T = T$
      + a bias which is a function of the category T
      + a random component normally distributed with a
        variance which is a function of the category T.          (10)
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Negative  values  of  By  and/or  Fy,  as  given  by  Eqs.  (9)  and  (10),  are  to 
be  added  to  their  corresponding  positive  cells. 

In this model there are four unknowns per category T: the frequency,
N_T, the normally-distributed variances, σ²_{B,T} and σ²_{F,T}, and the bias of
F_T relative to T. The frequencies, N_{F,B}, are the given resultants,
resulting from all categories of T distributed according to Eqs. (9) and
(10). In the example of Table 5, there are 20 x 4 unknowns, and 20 x 20
given values of N_{F,B}. The objective is to determine a best-fit solution.

Let the best fit yield frequency distributions denoted by \hat{N}_{F,B}. The
best-fit solution is that which minimizes the quantity


However, development of this diagnosis model lies beyond the scope
of the present study.
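
Although the inverse problem is not developed here, the forward direction of the model is easy to illustrate: given assumed bias and variance functions of T, Eqs. (9) and (10) generate a frequency table of (F, B) pairs. The sketch below is illustrative only. The uniform distribution of true values, the class width, the bias and variance functions, and the name `simulate_table` are all assumptions, and folding negative values by taking the absolute value is one reading of "added to their corresponding positive cells".

```python
import random

def simulate_table(n_pairs, bias, sigma_b, sigma_f, n_classes=20, width=1.0, seed=0):
    """Draw T, form B and F as in Eqs. (9)-(10), and count pairs per (F, B) class."""
    rng = random.Random(seed)
    table = [[0] * n_classes for _ in range(n_classes)]   # table[f_class][b_class]
    for _ in range(n_pairs):
        t = rng.uniform(0.0, n_classes * width)           # assumed distribution of true values
        b = abs(t + rng.gauss(0.0, sigma_b(t)))           # Eq. (9), negatives folded
        f = abs(t + bias(t) + rng.gauss(0.0, sigma_f(t))) # Eq. (10), negatives folded
        i = min(int(f / width), n_classes - 1)            # top class absorbs overflow
        j = min(int(b / width), n_classes - 1)
        table[i][j] += 1
    return table

# Hypothetical choices: constant field bias, report scatter growing with T.
table = simulate_table(
    10_000,
    bias=lambda t: 0.5,
    sigma_b=lambda t: 1.0 + 0.1 * t,
    sigma_f=lambda t: 1.5,
)
```

The diagnosis described above would run in the opposite direction, adjusting the four unknowns per category until the simulated table best matches the given one.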
5. THE FUNDAMENTAL APPROACH TO FIELDS


5.1 Information, and the Quality of Fields

Let  us  now  step  back  a  few  more  paces  in  order  to  take  a  broad  and 
fundamental  view  of  fields.  A  numerical  field  is  supposed  to  represent  a 
composite  rendering  of  all  available  relevant  information.  All  information 
stems  ultimately  from  observations. 

The information from which a current field may be produced may be
classified into three categories according to its time factor relative to
the current field:

Source  1 :  Concurrent  observations — observations  taken  within  the  applicable 
time  span  of  the  current  field. 

Source  2:  Near-past  observations — observations  which  were  taken  at  times 
preceding  the  current  field  but  recently  enough  to  have  a  direct 
bearing  on  the  current  field.  This  information  can  be  carried 
along  the  time  axis,  in  field  form,  by  extrapolation  or  prediction 
techniques,  up  to  the  applicable  time  of  the  current  analysis. 

Source  3:  Well-past  observations.  This  source  is  the  basis  for  all  types  of 
physical  and  statistical  relationships  and  other  organizations  of 
history.  All  physical  equations  establish  approximate  relationships 
between  parameters,  based  on  past  observations  of  these 
parameters.  In  a  very  true  sense  a  physical  equation  is  a 
compact  representation  and  essence-extracting  distillation  of 
the  relationship  between  all  prior  observations  of  the  included 
parameters;  as  such  a  physical  equation  represents  information. 

The  information  represented  by  a  field  does  not  just  provide  an 
estimate  of  the  absolute  value  of  the  object  parameter  at  every  point  in  the 
field.  Resolution  of  shape  parameters,  such  as  gradient,  in  the  field- 
objective  scale  of  resolution,  may  be  more  significant.  Shape  parameters 
resolve  spatial  features;  patterns  such  as  troughs  may  be  of  the  essence. 
Information  for  the  production  of  a  field  also  comes  in  various  forms  in 
each  of  the  three  source  categories  cited  above.  Information  comes  in  the 
form  of  estimates  of  the  object  parameter  at  locations  in  the  field,  in  the 
form of gradients and other shape parameters derived from scanning
observation  systems  or  from  related  fields,  and  in  many  complex  forms. 

The  verification  of  fields,  and  the  evaluation  of  production  methods, 
should  be  concerned  with  all  aspects  of  field  significance — with  how  well 
the  production  capability  accommodates  each  item  of  assimilated  information 
and  with  the  qualities  of  the  field. 

Observations  provide  estimates  of  the  field-objective  true  values  at 
scattered  locations  in  the  field.  Verification  often  is  concerned  only  with 
this  one  aspect  of  the  field.  Such  verification  is  incomplete.  Observations 
may  conflict  with  other  types  of  local  information  assimilated  in  the 
production  of  the  field. 

Fields  should  not  include  extraneous  features — features  which  are 
unsupported  by  information.  Among  several  fields  which  accommodate  the 
total  assimilated  information  equally,  that  field  having  the  smallest  total 
variability  is  the  superior  field. 

Predicted  fields  include  no  concurrent  observations  of  any  type. 
Diagnosed  fields  include  no  direct  concurrent  observations  of  the  object 
parameter.  Analyzed  fields  generally  include  assimilation  of  some 
concurrent  observations  of  the  object  parameter,  but  additional  observations, 
not  available  at  the  time  of  field  production,  may  later  become  available  for 
verification  purposes.  We  shall  term  included  observations  as  dependent, 
and  those  not  included  as  independent  observations. 

An  observation  may  not  be  representative  in  the  objective  range- 
of-scale  of  the  field.  The  observation  may  be  non-representative  due  to 
the  presence  of  a  subscale  anomaly  in  the  field.  Field  analysis  capabilities 
which  include  observation  evaluation  functions  may  reject  an  observation 
in  producing  an  analysis  for  one  objective  range  of  scale  and  may  accept 
the  same  observation  when  analyzing  to  a  finer  scale  for  the  same 
subregion.  There  is  a  gray  zone  in  which  no  method  can  differentiate 
between  an  erroneous  observation  and  an  observation  that  has  been 
strongly  affected  by  a  subscale  feature.  The  point  to  note  is  that 
individual  observations  vary  in  value  relative  to  a  field. 




5.2 Fields by Information Blending

The Fields by Information Blending (FIB) methodology is based on
associating a reliability with every item of information and on the
quantitative assembly and blending of various types of weighted estimates
so as to maximize the information yield of the field. Details about FIB and
its many applications can be found in various reports.* Some aspects
are particularly relevant in the present context.

The reliability, or weight, of an estimate is defined as the inverse
of the variance associated with that estimate relative to the field-objective
true value of that estimate. Uncorrelated contributing variances are
additive:

    \sigma^2 = \sigma^2(\text{observation}) + \sigma^2(\text{subscale}) + \sigma^2(\text{transmission})    (12)

The subscale variance component is a function of the field-objective lower
range of scale.

In specific applications information in a variety of field properties
(value, gradient components, etc.) is assembled and blended. Each
assembled field property has its own associated weight field (A, B, C, etc.).
The blended resultant field also has an associated weight field for each
property (A*, B*, C*, etc.); because of economy considerations they are
not always computed. But whether or not computed they are defined in
association with the product field.

Consider an observation, H_n, with reliability, A_n, which has been
included in the assembly and blending. Denote the local field resultant for
that property by H* of reliability A*. The independent background
information for evaluating the observation is given by

    A_B = A^* - A_n    (13)

    H_B = \frac{A^* H^* - A_n H_n}{A_B}    (14)



*The most complete account to date of a FIB-based analysis system
is to be found in Reference [4].
The standard deviation between H_B and H_n is given by

    \sigma = \sqrt{\frac{1}{A_B} + \frac{1}{A_n}} = \sqrt{\frac{A_n + A_B}{A_n A_B}}    (15)

We denote the actual difference, expressed in units of the standard
deviation, by λ_n:

    \lambda_n^2 = \frac{(H_B - H_n)^2}{\sigma^2} = \frac{A_n A_B}{A_n + A_B} (H_B - H_n)^2    (16)

The reevaluation of a report is based on the value of λ_n. If λ_n is less
than one, or in the vicinity of one, its weight is left as purported for
the class of observations. If λ_n is considerably larger than one the
purported weight, A_n, is reduced by proportionate formula to the point
of extinction (i.e., report rejection) at the, somewhat arbitrary, value
of λ_n = 2.5. In a normal distribution about 2% of λ_n values exceed 2.5;
these are either subscale mavericks or gross errors.

It should be noted that the reevaluation depends on how much
independent background information is locally available. For a given
absolute difference, H_B - H_n, a larger A_B produces a larger λ_n up to
the limit of

    \lambda_n^2 = A_n (H_B - H_n)^2    (17)

A smaller A_B produces a smaller λ_n down to the limit of λ_n = 0. The
result  of  the  inclusion  of  this  reevaluation  process  is  that  isolated 
observations  far  removed  from  any  independent  conflicting  information 
will  be  fitted  most  closely  in  the  absolute  sense.  The  assimilation  of 
all  available  relevant  information  in  addition  to  observations  of  the  object 
parameter  will  result  in  the  observations  being  fitted  less  well  in  the 
absolute  sense.  The  conclusion  to  be  drawn  from  all  this  is  that 
verification  should  not  be  based  on  any  dependent  observations — but 
only  on  independent  observations. 
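
The reevaluation step can be sketched as follows. This is an illustration under stated assumptions, not the FIB implementation: the background weight and value follow Eqs. (13) and (14), the difference is normalized by the standard deviation sqrt(1/A_B + 1/A_n), and the "proportionate formula" for reducing the weight is not given in the text, so a linear taper from full weight at a normalized difference of 1 to rejection at 2.5 is assumed. The function name `reevaluate` is hypothetical.

```python
import math

def reevaluate(h_n, a_n, h_star, a_star, reject_at=2.5):
    """Return (normalized difference, adjusted weight) for a report that has
    already been blended into the local field resultant (h_star, a_star)."""
    a_b = a_star - a_n                            # Eq. (13): independent background weight
    if a_b <= 0.0:
        return 0.0, a_n                           # no independent information: weight unchanged
    h_b = (a_star * h_star - a_n * h_n) / a_b     # Eq. (14): independent background value
    sigma = math.sqrt(1.0 / a_b + 1.0 / a_n)      # standard deviation of (h_b - h_n)
    lam = abs(h_b - h_n) / sigma                  # difference in units of sigma
    # ASSUMPTION: linear taper between lam = 1 (full weight) and reject_at (zero weight).
    factor = min(1.0, max(0.0, (reject_at - lam) / (reject_at - 1.0)))
    return lam, a_n * factor

# A report of 2.0 (weight 1) blended with a background of 0.0 (weight 1)
# gives a resultant of 1.0 with weight 2.
lam, weight = reevaluate(h_n=2.0, a_n=1.0, h_star=1.0, a_star=2.0)
```

With little independent background (small A_B) the normalized difference shrinks toward zero, so an isolated report keeps its full weight however far it lies from the field, which is why dependent reports flatter the field in verification.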


5.3  Verification 

If a field has been produced by an application of the FIB methodology
then estimates of the associated product reliability fields can be calculated
at considerable computational cost. The verification of a field may be
intended  for  the  purpose  of  checking  such  reliability  fields,  but,  generally, 
verification  is  intended  to  assess  field  reliability  where  no  associated 
product  reliabilities  are  available. 

Only independent observations of the object parameter should be
used to verify a field in the limited absolute sense. Let an observation be
denoted by B_n, the corresponding field value by F_n, and the corresponding,
but unknown, field-objective true value by T_n. For an adequately large
sample the variances are related by Eq. (6):

    \frac{1}{N} \sum_n (F_n - T_n)^2 = \frac{1}{N} \sum_n (F_n - B_n)^2 - \frac{1}{N} \sum_n (B_n - T_n)^2    (18)

The  left-hand  side  expresses  the  variance  of  the  field  relative  to  true. 

The  first  term  on  the  right-hand  side  can  be  evaluated  for  the  sample. 

The  second  term  can  be  based  on  the  purported  variance  of  the  class  of 
observations  relative  to  the  field-objective  true  values,  as  expressed  by 
Eq.  (12). 
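
The relationship can be checked on synthetic data. The sketch below is illustrative only: the error standard deviations and the distribution of true values are hypothetical, and the errors of field and report are drawn independently, which is the condition under which the cross term vanishes for independent observations.

```python
import random

def field_variance_estimate(pairs, obs_variance):
    """Mean square (F - T) estimated as mean square (F - B) minus the
    purported variance of the reports relative to true, as in Eq. (18)."""
    ms_fb = sum((f - b) ** 2 for b, f in pairs) / len(pairs)
    return ms_fb - obs_variance

rng = random.Random(1)
sigma_b, sigma_f = 2.0, 1.5           # hypothetical error standard deviations
pairs = []
for _ in range(200_000):
    t = rng.uniform(0.0, 20.0)        # hypothetical true value, never used directly below
    pairs.append((t + rng.gauss(0.0, sigma_b), t + rng.gauss(0.0, sigma_f)))

estimate = field_variance_estimate(pairs, sigma_b ** 2)
# estimate should lie close to sigma_f**2 = 2.25, up to sampling noise
```

The estimate is only as good as the purported report variance; an overstated report variance makes the field look better than it is.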
6. THE OPL2X VERIFICATION STATISTICS PACKAGE


6.1 Introduction

Study  of  the  literature  concerning  field  verification  techniques 
quickly  reveals  that  little  thought  has  been  given  to  the  fundamental 
questions  of  what  a  numerical  field  actually  represents,  what  constitutes 
a  good  field,  and  how  the  quality  of  a  numerical  field  may  be  judged. 

As  we  have  seen  in  previous  Sections,  lack  of  appreciation  of  these 
issues  can  lead  to  "verification  schemes"  which  either  are  statistically 
invalid  or  which  produce  results  prone  to  misinterpretation.  For  example 
a  scheme  based  entirely  on  stratification  of  a  bivariate  sample  of  observed 
and  field  values  into  categories  of  observed  values  produces  results  which, 
in  a  mathematical  sense,  are  correct.  The  fault  occurs  when  these  results 
are  interpreted  as  evidence  of  a  bias  in  the  field  values.  It  also  will  be 
appreciated  that  use  of  a  regression  equation  can  lead  to  the  discovery 
of  biases  which  actually  are  non-existent.  Clearly  an  invalid  verification 
technique  (or  one  for  which  the  results  are  routinely  misinterpreted)  is 
worse  than  no  verification  scheme  at  all. 
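
The misinterpretation can be reproduced with synthetic data in which neither the field nor the reports are biased. The sketch below is illustrative; the distributions and thresholds are hypothetical. Stratifying by observed value alone produces mean differences that look like a bias of the field.

```python
import random

rng = random.Random(2)
pairs = []
for _ in range(100_000):
    t = rng.gauss(10.0, 3.0)          # hypothetical true value
    b = t + rng.gauss(0.0, 2.0)       # unbiased report
    f = t + rng.gauss(0.0, 2.0)       # unbiased field value
    pairs.append((b, f))

def mean_diff(sample):
    """Mean of field minus report over a list of (B, F) pairs."""
    return sum(f - b for b, f in sample) / len(sample)

overall = mean_diff(pairs)                              # near zero
low = mean_diff([p for p in pairs if p[0] < 7.0])       # category of low observed values
high = mean_diff([p for p in pairs if p[0] > 13.0])     # category of high observed values
```

The low category shows the field apparently too high and the high category shows it apparently too low, although the overall mean difference is near zero. The effect arises because a low observed value is more likely than not to carry a negative report error.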

To  carry  out  field  verification  in  a  meaningful  and  informative 
manner  is  not  a  simple  task.  Stratification  in  categories  of  field  value 
is relevant but, as explained in Section 3, the results are prone to
misinterpretation.  The  only  stratification  which  gives  desired  specifics 
as  functions  of  value  is  in  ranges  of  true  value — but  the  corresponding 
true  values  generally  are  not  available. 

Verification  actually  involves  a  trivariate  distribution.  Although 
the  true  values  corresponding  to  the  bivariate  sample  of  field  and 
observed  values  are  not  available  they  could  be  diagnosed  by  a  model. 

Such  a  model  is  defined  in  Section  4.3.  The  results  of  the  diagnosis 
would  include  field  bias  and  variance,  and  observed  variance,  in  ranges 
of  true  value.  However  the  development  of  the  formulation,  including  a 
method  for  solving  the  resulting  system  of  equations,  will  not  be 
straightforward.  The  problem  lies  in  modifying  the  formulation  to  make 
the  system  at  least  quasi-linear  so  that  it  can  be  solved  iteratively  in 
terms  of  an  inner  linear  system.  Also,  the  assumption  of  normally- 
distributed  variance  will  have  to  be  changed  to  suit. 
At this time development and utilization of a diagnostic model for
routine  operational  use  is  not  recommended.  It  is  doubtful  that  such 
diagnosis  would  generally  show  enough  of  interest  in  the  way  of 
peculiarities  to  warrant  the  diagnostic  computations.  Besides  being 
complex  and  costly  it  would  verify  only  one  aspect  of  a  field — proximity 
to  true — and  hence  would  give  that  single  aspect  of  field  quality  total 
importance.  As  explained  in  Section  5  there  is  much  more  to  a  field  than 
mere  proximity  to  the  categorical  true  value,  and  much  more  information 
than  recent  and  current  observations  should  go  into  the  production  of 
a  field.  At  this  time  diagnostic  models  are  seen  as  special-purpose 
programs  to  be  applied  to  data  samples  drawn  from  collections  of  fields 
for  diagnosis  of  specific  field  production  capabilities  of  the  diagnostic 
and  prognostic  types.  For  example  they  may  prove  very  useful  for 
evaluating  various  types  of  quasi-geostrophic  wind  transforms,  or 
ocean-wave  generation  models. 


It  is  clearly  desirable  to  have  a  capability  for  the  routine  verification 
of  FNOC  fields.  As  explained  in  Section  4,  frequency  tables  which 
summarize  the  bivariate  sample  are  valuable  aids  to  field  verification. 

These  tables  are  useful  whether  applied  to  dependent  or  independent 
observations — the  samples  should  be  clearly  distinguished.  Note  especially 
that  stratification  by  observed  values  or  by  field  values  should  be  avoided. 
The  output  tables  must  include  cautionary  comments  with  regard  to 
interpretation  to  prevent  the  User  from  "discovering"  that  field  values 
(and/or  observed  values)  are  biased. 

The overall mean difference

    b = \frac{1}{N} \sum_n (F_n - B_n)    (19)

and the variance

    \sigma^2_{F,B} = \frac{1}{N} \sum_n (F_n - B_n)^2    (20)

are very significant in the case of independent observations. According
to Eq. (18),

    \sigma^2_{F,B} = \sigma^2_{F,T} + \sigma^2_{B,T}    (21)

Clearly, if a change is made to a field production capability which results
in a systematic reduction of the mean difference, b, and the variance,
σ²_{F,B}, then that capability has been improved in its ability to approximate
the categorical field-objective true values. Note however that the
calculated variance, σ²_{F,B}, is lower-bounded by the subscale variance,
in the limit of a perfect field. The mean difference is a check on
the overall bias. The resultants, b and σ²_{F,B}, may be usefully stratified
by geographical regions.


6.2 Outline of the OPL2X Program

The  underlying  features  and  capabilities  of  a  soundly-reasoned  field 
verification  scheme  for  FNOC  products  have  been  outlined  above. 

The OPL2X package produces field verification statistics¹ based on
comparison with concurrent report values. Bivariate samples of report
values, B, paired with the corresponding field values, F, obtained by
field interpolation, are prepared, sorted,² and analyzed for each field.³

¹Although designed specifically for fields produced by the operational
FNOC system, OPL2X is equally applicable (with some program modification)
to products generated elsewhere.

²In order to remove gross errors and subscale-maverick values from
the observations, the sample is sorted on the magnitude |F - B| in
decreasing order. A specifiable percentage (e.g., 2%) is withheld from
the top of this list.

³Wind fields are verified in terms of two scalar components, the wind
speed and the wind direction.

The OPL2X package carries out the following analyses of each
bivariate data sample:

a. Compilation of the frequency table of observed (B) and
field (F) value pairs.

b. Calculation of row and column frequency totals, mean values
and/or mean differences.

c. Calculation of the slope of the best-fit straightline (through the
origin) to the sample in the F,B plane.

d. Calculation of the overall mean difference, F - B, and the
variance and/or standard deviation.

e. Stratification of the data sample by latitude zones.
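
The withholding of the largest differences, combined with the variance calculation of item d, can be sketched as below. The Product Interpretation Guide states that the mean square of the remaining differences is multiplied by an adjustment factor greater than one, based on a normal distribution, but does not give the factor. This sketch assumes the factor is the reciprocal of the variance ratio of a zero-mean normal distribution truncated symmetrically so as to lose the withheld fraction; the name `effective_variance` and the synthetic sample are hypothetical.

```python
import random
from statistics import NormalDist

def effective_variance(differences, withheld_fraction=0.02):
    """Mean square of the F - B differences after withholding the largest
    magnitudes, rescaled to compensate for the truncation (assumed normal)."""
    ordered = sorted(differences, key=abs)
    n_withheld = int(round(withheld_fraction * len(ordered)))
    kept = ordered[: len(ordered) - n_withheld]
    mean_square = sum(d * d for d in kept) / len(kept)
    if n_withheld == 0:
        return mean_square
    std = NormalDist()
    c = std.inv_cdf(1.0 - withheld_fraction / 2.0)   # truncation point in sigma units
    # Variance of a standard normal truncated to [-c, c], relative to 1.
    ratio = 1.0 - 2.0 * c * std.pdf(c) / (1.0 - withheld_fraction)
    return mean_square / ratio                       # adjustment factor 1/ratio > 1

rng = random.Random(3)
sample = [rng.gauss(0.0, 3.0) for _ in range(100_000)]   # synthetic differences, variance 9
variance = effective_variance(sample)
```

For a clean normal sample the rescaling recovers the untrimmed variance; when gross errors are present they fall in the withheld portion and no longer dominate the mean square.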

The  OPL2X  package  produces  a  frequency  table  based  on  the  scatter 
diagram  of  report  values,  B,  versus  field  values,  F,  stratified  into  ranges 
of  B  and  of  F.  This  stratification  of  the  B,F  coordinate  plane,  illustrated 
by  Fig.  2a,  is  the  basis  of  several  useful  row  and  column  summaries. 
However  certain  features  of  the  bivariate  scatter  are  not  well  revealed  by 
this  frequency  table.  The  table  does  not  give  a  clear  indication  of  the 
difference  distribution,  F-B. 


[Figures 2a and 2b: stratifications of the B,F coordinate plane, with axes B and F]


A second frequency table therefore is provided, based on a
different stratification of the B,F coordinate plane, as shown by Fig. 2b.
This frequency table is printed in an orientation which represents a 45-
degree clockwise rotation of Fig. 2b, with ½(F+B) as abscissa and (F-B)
as ordinate, as shown by Fig. 3. While ½(F+B) must be stratified to
cover the full range of the parameter, the ordinate F-B may be stratified
to best reveal the relevant scatter.

[Figure 3: the rotated frequency table, with ½(F+B) as abscissa and (F-B) as ordinate]
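
A minimal sketch of the second tabulation follows, assuming equal-width classes on each axis with difference classes centered on multiples of the class width. The class widths and the name `rotated_table` are hypothetical; OPL2X's actual class limits are not given here.

```python
import math
from collections import Counter

def rotated_table(pairs, mean_width, diff_width):
    """Count (B, F) pairs in cells of mean value (F+B)/2 and difference F-B."""
    counts = Counter()
    for b, f in pairs:
        mean_class = math.floor(((f + b) / 2.0) / mean_width)   # abscissa class index
        diff_class = math.floor((f - b) / diff_width + 0.5)     # ordinate class, centered on 0
        counts[(mean_class, diff_class)] += 1
    return counts

cells = rotated_table([(10.0, 12.0), (10.0, 12.2), (5.0, 5.0)],
                      mean_width=2.0, diff_width=1.0)
```

Because the two axes are scaled independently, the difference axis can use much finer classes than the mean-value axis, which is what lets this table show the scatter about the one-to-one line.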

As  currently  configured,  OPL2X  can  produce  verification  statistics 
for  the  following  fields: 

a.  Wind  speed  and  wind  direction. 

b.  Wave  height  and  wave  period. 

c.  Sea-level  pressure. 

d.  Sea-surface  temperature. 
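
For wind direction, the Product Interpretation Guide specifies that any pair difference exceeding +180 degrees is lowered by 360 degrees, and any difference below -180 degrees is raised by 360 degrees, before mean differences are formed. A minimal sketch, assuming both directions are given in degrees within 0 to 360; the function name is hypothetical.

```python
def direction_difference(field_deg, report_deg):
    """Field minus report wind direction, reduced to the -180..+180 degree range."""
    diff = field_deg - report_deg
    if diff > 180.0:
        diff -= 360.0      # e.g. 350 - 10 = 340 becomes -20
    elif diff < -180.0:
        diff += 360.0      # e.g. 10 - 350 = -340 becomes +20
    return diff
```

Without this reduction, a field direction of 350 degrees and a report of 10 degrees would contribute a difference of 340 degrees rather than 20 degrees in magnitude.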

OPL2X  may  be  run  daily  with  output  directed  to  fiche  for  hard-copy 
retention,  and  to  a  disc-resident  file  for  temporary  system  storage  for  use 
in  compiling  weekly  summaries  and  for  other  processing.  The  disc-resident 
file  is  circular,  holding  up  to  31  days  of  data. 
On a once-per-week basis the daily run normally is followed by a
weekly-summary run,¹ with output directed to printer. The frequency
tables represent the accumulated bivariate sample for each field parameter.
The daily desiderata, including the mean difference (F-B), mean absolute
difference |F-B|, the variance, and the straightline-through-origin F/B
slope, are listed in individual columns with each row corresponding to
one day of the week, producing a table for each field parameter. With
each summary run, a Product Interpretation Guide² also is output.

In  order  to  facilitate  use  of  the  output  all  necessary  information  for 
a complete index in each run of OPL2X is produced as the last part of
the  output.  The  index  is  stored  in  a  form  that  can  optionally  be  sorted 
by  page  number,  parameter  type,  time,  or  grid  region,  as  desired  by  the 
User.  More  than  one  type  of  sorted  index  may  be  output  in  a  run. 

Other  features  of  the  output  are  apparent  from  the  examples  given 
in  the  following  Section. 

6.3  Output  Generated  by  OPL2X 
6.3.1  Introduction 

This  Section  provides  samples  of  field  verification  statistics  and  other 
information  output  by  OPL2X.  These  samples  are  selected  from  a  daily  run 
made  on  08  JAN  80  followed  by  a  summary  run. 


¹Summary runs actually may be produced with any daily run.

²See following Section.


6.3.2 Product Interpretation Guide¹

GENERAL  COMMENTS:  Concepts  relating  numerically-expressed  fields  and 
concurrent  observations  have  been  developed  by  M.  M.  Holl.  (See,  in 
particular,  "The  Verification  of  Fields  Based  on  Comparison  with  Concurrent 
Observations  (A  Critical  Review  of  OPL2)",  Meteorology  International 
Incorporated,  S.  O.  7R-13,  Contract  Number  N00228-78-D-4316,  October  1978.) 
The  analyzed  verification  statistics  which  follow  must  be  interpreted  in  terms 
of  such  concepts.  The  statistics  are  of  use  in  evaluating  fields,  and  the 
capabilities  which  produced  the  fields,  in  limited  senses.  The  User  is 
cautioned  to  avoid  certain  conclusions  which  may  appear  at  first  sight  to 
be  intuitive  but  which  are  in  fact  erroneous.  Notes  to  guide  interpretations 
and  precautions  follow  in  appropriate  contexts. 

A  numerically-expressed  field  is  an  estimate  of  an  objective  distribution, 
the  desired  "true"  distribution,  which  must  be  defined  in  terms  of  an  effective 
time  span  for  which  the  distribution  is  representative,  a  lower-bounded 
objective  range  of  scale,  and  the  significant  facets  of  variability.  Synoptic 
observations  themselves  are  representative  of  subregions  in  space  and  time. 
The  field  is  generally  desired  for  resolution  of  not  only  parameter  value  but 
also  of  gradient  and  other  characteristics  of  shape.  Information  for  resolution 
of  these  various  facets  of  the  distribution  comes  in  a  variety  of  forms. 

Reports,  i.e.,  direct  observations  and  measurements  of  the  object 
parameter  value,  estimate  only  local  values  of  the  distribution.  These 
estimates  include  subscale  and  error  variances  relative  to  objective  true 
value.  Further,  evaluation  of  a  field  using  only  such  reports  does  not 
directly  assess  the  resolution  of  the  desired  elements  of  shape  in  the 
objective  space  and  time  scales. 

The  concurrent  reports  which  are  used  to  verify  a  field  may  be 
dependent,  i.e.,  they  have  already  been  assimilated  in  the  production  of 
the  field,  or  they  may  be  independent  of  the  field.  If  independent,  then 
it  is  generally  true  that  the  better  they  match  the  corresponding  field 
values  the  better  is  the  field,  at  least  in  this  one  respect.  However  if 
dependent then a better match does not necessarily imply a better field;
resolution of the significant elements of shape may be deteriorated by
fitting reports too closely.

¹This Guide normally forms the first part of a summary run. The
numbers (1) through (19) in the Guide refer to elements of the output
statistics which are correspondingly identified.

Fields  may  be  prognostic  (based  on  prior  observations  of  all  kinds 
expressed  in  the  form  of  reports  and  models);  diagnosed  (based  on  prior 
observations  and  concurrent  analyses  of  related  parameters,  but  excluding 
concurrent reports of the object parameter); or analyzed (based on all
available relevant information including concurrent reports of the object
parameter).

Analyzed  fields  may  be  verified  using  dependent  concurrent  reports  but 
such  verification  reflects  more  on  the  assimilation  scheme  which  produced  the 
field  than  it  does  on  field  evaluation.  A  well-produced  analysis  should  be  an 
optimized  accommodation  of  all  assimilated  pieces  of  information  which  are 
related  to  parameter  value  and  to  elements  of  shape.  All  pieces  of  information 
include  variance  relative  to  the  objective  true  distribution.  Optimization 
implies  that  there  are  generally  many  more  pieces  of  information  than  there 
are  degrees  (e.g.,  number  of  grid-point  values)  of  accommodation  in  the 
numerically-expressed  field.  It  is  not  generally  desirable  to  minimize  residual 
variance  in  the  accommodation  of  reports  at  the  expense  of  increasing 
residual  variances  in  the  accommodation  of  shape  information. 

The  evaluation  of  a  field  requires  independent  reports.  Analyzed 
fields  may  be  evaluated  by  using  reports  which  were  not  made  available  to 
the  analysis. 

The  verification  statistics  are  based  on  limited  analysis  of  bivariate 
distributions:  report  values,  B,  are  paired  with  corresponding  field  values, 
F,  obtained  by  field  interpolation  based  on  a  field  interpretation  scheme. 

A  third  member  associated  with  each  pair,  but  which  is  generally  unknown, 
is  the  corresponding  true  value,  T. 

The  variance  of  F  relative  to  T  would  be  a  measure  of  the  absolute 
resolution  accuracy  of  the  field.  Since  T  is  not  available,  the  variance 
of  F  relative  to  B  is  calculated.  The  adjusting  relationship  is: 

mean  square  (F  -  T)  =  mean  square  (F  -  B)  -  mean  square  (B  -  T)  . 

The  second  term  on  the  right-hand  side  is  the  subscale  variance  which 
limits  the  accommodation  of  reports  by  the  field.  The  subscale  variance 
can be estimated for a class of reports and the space and time scales of
the  field.  It  is  desirable  to  exclude  reports  which  are  in  gross  error,  or 
appear  to  be  doubtful  mavericks,  from  the  above  relationship.  Such 
reports  should  also  be  rejected  by  the  analysis  capability. 

THE  STATISTICAL  ANALYSIS:  Bivariate  samples  of  report  values,  B,  and 
the  corresponding  field  values,  F,  obtained  by  field  interpolation,  are 
prepared,  sorted,  and  analyzed  for  each  field.  Wind  fields  are  verified 
in  terms  of  two  scalar  components,  the  wind  speed  and  the  wind  direction. 
The  following  statistical  quantities  are  calculated  for  each  bivariate  sample: 

(1)  FREQUENCY  DISTRIBUTION  No.  1: 

A  table  showing  the  frequency  distribution  of  the  bivariate 
sample  according  to  categories  of  range.  In  effect  the  table 
is  the  same  as  a  coarse  scatter  diagram  in  the  B,F  coordinate 
plane  with  B,  the  observed  value,  as  abscissa,  and  F,  the 
field  value,  as  ordinate.  Each  element  in  the  body  of  the 
table  expresses  the  number  of  pairs  which  fell  within  the  range 
limits  of  the  cell. 

(2) - (7) The center values of range intervals are shown along the
base (2) for report value; the same ranges are shown to the
left  (3)  of  the  table  for  field  value.  For  parameters  which 
are  lower  bounded  at  zero  (e.g.,  wind  speed)  a  zero  represents 
a  range  interval  which  is  half  of  the  others.  The  frequency 
distribution  of  report  values  is  shown  (4)  below  the 
corresponding  ranges.  These  numbers  correspond  to  the 
column  totals.  The  frequency  distribution  of  field  values  is 
shown  (5)  to  the  left  of  the  corresponding  ranges.  These 
numbers  correspond  to  the  row  totals.  The  mean  differences, 
field  minus  report,  are  shown  (6)  for  the  range  categories 
of  report  values.  The  mean  differences,  report  minus  field, 
are  shown  (7)  for  the  range  categories  of  field  values. 


In the case of wind direction, in calculating mean
differences, (6) and (7), any pair difference which exceeds
+180 degrees is lowered by subtracting 360 degrees, and any
pair difference less than -180 degrees is raised by adding
360 degrees. This reduces the individual differences to the
-180 to +180 degree range.

(8) The overall mean difference, i.e., the mean value of (F - B).
It should be noted that all differences which exceed a prescribed
upper bound in magnitude are omitted from this calculation in
order to exclude erroneous reports. This compensates for the
fact that gross errors have not been culled from the list of
reports.

If  the  reports  are  dependent,  and  if  the  analysis  field  is 
based  on  assimilation  of  no  other  information  which  bears  directly 
on  the  value  of  the  parameter,  then  it  would  seem  reasonable 
to  assume  that  the  field  could  be  improved  by  subtracting  the 
calculated  mean  from  the  field  everywhere.  This  may  be 
appropriate  for  some  analysis  capabilities  which  give  all  reports 
equal  weight.  However  a  word  of  caution  is  in  order.  Powerful 
comprehensive  analysis  capabilities,  such  as  those  based  on  the 
FIB  methodology,  reevaluate  the  individual  reports  which  are 
assimilated  in  the  analysis  process.  Those  reports  which  are 
most  at  variance  with  other  direct,  independent,  local 
information  are  downweighted  in  successive  refinements  of  the 
analysis.  This  can  result  in  a  calculated  mean  (unweighted) 
difference  which  does  not  represent  any  possible  improvement 
to  the  field. 

(9) The overall variance, i.e., the mean value of (F - B) squared.
Another  procedure  is  followed  here  to  avoid  contamination  by 
erroneous  reports.  The  list  of  differences  is  first  sorted  in 
order  of  magnitude.  A  specified  percentage  of  the  total 
number  of  reports  is  omitted  from  the  top  of  this  sorted  list, 
and  the  mean  square  is  calculated  for  the  differences  that 
remain.  This  calculated  mean-square  value  is  then  multiplied 
by  an  adjustment  factor,  greater  than  one,  based  on  a  normal 
distribution,  to  produce  an  effective  variance. 


The  interpretation  of  this  calculated  variance  involves 
appreciation  of  the  subscale  of  parameter  variability  relative 
to  the  objective  true  distribution.  The  calculated  variance 
is  lower  bounded  by  the  subscale  variance.  The  calculated 
variance  should  be  diminished  for  an  analysis  which  purports 
to  resolve  a  finer  scale. 
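
The trimmed mean-square procedure described above can be sketched as follows. This is a minimal illustration, not the OPL2X code: the function names, the default omitted fraction of 5 percent, and the exact form of the adjustment factor are assumptions. The factor used here is the one that restores the variance of a zero-mean normal sample after the stated fraction of largest-magnitude values has been removed.

```python
from statistics import NormalDist


def adjustment_factor(p):
    """Factor (>= 1) that restores the variance of a zero-mean normal
    sample after the fraction p of largest magnitudes has been removed.
    (Assumed form; the report does not give the formula.)"""
    if p == 0.0:
        return 1.0
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1.0 - p / 2.0)  # cutoff in standard deviations
    # E[x^2 | |x| <= z] for a standard normal variate x.
    retained = 1.0 - 2.0 * z * std_normal.pdf(z) / (1.0 - p)
    return 1.0 / retained


def effective_variance(differences, omit_fraction=0.05):
    """Trimmed mean square of the F - B differences, rescaled."""
    if not 0.0 <= omit_fraction < 1.0:
        raise ValueError("omit_fraction must be in [0, 1)")
    # Sort by magnitude, largest first, and omit the top of the list.
    ordered = sorted(differences, key=abs, reverse=True)
    n_omit = int(omit_fraction * len(ordered))
    kept = ordered[n_omit:]
    if not kept:
        raise ValueError("no differences remain after trimming")
    trimmed_mean_square = sum(d * d for d in kept) / len(kept)
    return trimmed_mean_square * adjustment_factor(n_omit / len(ordered))
```

Because the largest differences are discarded before the mean square is formed, a small number of gross errors has little influence on the result, while the adjustment factor compensates for the legitimate large differences that were discarded with them.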

(10)  The  slope  of  the  best-fit  straight  line  which  also  passes  through
the  origin.  The  overall  relationship  between  observed  values
and  field  values  can  be  examined  by  fitting  a  best  straight  line
to  all  the  points  in  a  scatter  diagram.  The  frequency  table  is
a  coarse  version  of  the  scatter  diagram.  The  best  straight  line
is  defined  as  that  which  minimizes  the  mean  square  distance
of  all  the  points  (the  B,F  pairs)  from  the  line,  measuring  the
shortest  (i.e.,  line-normal)  distance  from  point  to  line.  The
line  can  also  be  restricted  to  pass  through  the  origin,  B  =  F  =  0.
The  tangent  (10)  of  this  line  is  calculated;  ideally  its  value  is
unity.  Disparity  from  one,  however,  cannot  be  directly
interpreted  in  terms  of  overall  bias.  The  calculation  of  slope
is  also  not  appropriate  for  wind  direction;  the  origin  has  no
special  significance  as  it  does  in  the  case  of  wind  speed  or
wave  height.
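
A line through the origin that minimizes the mean square line-normal distance has a closed-form slope, sketched below. The function name is hypothetical, and the sketch assumes that B is plotted on the abscissa and F on the ordinate; the report does not state which axis is which. The angle of the line is that of the major principal axis of the second moments of the points taken about the origin.

```python
import math


def origin_line_slope(pairs):
    """Slope of the line through B = F = 0 that minimizes the mean square
    perpendicular distance of the (B, F) points from the line.

    Assumes B on the abscissa and F on the ordinate.
    """
    s_bb = sum(b * b for b, _ in pairs)
    s_ff = sum(f * f for _, f in pairs)
    s_bf = sum(b * f for b, f in pairs)
    if s_bb == 0.0 and s_ff == 0.0:
        raise ValueError("all points lie at the origin")
    # Minimizing sum((f*cos(t) - b*sin(t))**2) over the line angle t
    # gives tan(2t) = 2*s_bf / (s_bb - s_ff).
    theta = 0.5 * math.atan2(2.0 * s_bf, s_bb - s_ff)
    return math.tan(theta)
```

Unlike an ordinary regression of F on B, this fit treats the two variables symmetrically: exchanging B and F gives the reciprocal slope. The result becomes very large when the points lie close to the F axis, and it is not meaningful when the scatter has no preferred direction.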

(11)  FREQUENCY  DISTRIBUTION  No.  2: 

This  second  tabulation  of  the  bivariate  sample  gives  a  clearer 
representation  of  the  one-to-one  relationship  between  observed 
and  corresponding  field  values.  It  results  from  a  45  degree 
clockwise  rotation  of  the  orientation  of  the  B,F  scatter  plane 
on  which  Table  No.  1  is  based.  The  axis  of  the  mean  value, 
(F+B)/2,  becomes  the  abscissa,  and  the  axis  of  the  difference, 
(F-B),  becomes  the  ordinate.  Each  of  these  two  axes  may  be 
independently  scaled  into  discrete  range  intervals.  Each 
element  in  the  body  of  the  table  expresses  the  number  of 
pairs  which  fall  within  the  range  limits  of  the  cell. 

(12)  -  (16)  The  center  values  of  range  intervals  are  shown  along  the
base  (12)  for  mean  value,  and  in  the  column  to  the  left  (13)
for  the  difference  value.  For  parameters  which  are  lower
bounded  at  zero  (e.g.,  wind  speed)  a  zero  represents  a  range 
interval  which  is  half  of  the  others.  The  frequency  distribution 
of  mean  values  is  shown  (14)  below  the  corresponding  ranges. 
These  numbers  correspond  to  the  column  count  totals.  The 
frequency  distribution  of  difference  values  is  shown  (15)  to 
the  left  of  the  corresponding  ranges.  These  numbers 
correspond  to  the  row  count  totals.  The  mean  differences 
are  shown  (16)  for  the  ranges  of  mean  value. 
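
The tabulation and its margins can be illustrated with a short sketch. The function name, the argument names and the use of range-center indices are assumptions made for this example and are not taken from OPL2X. A range is identified by the index of its center value, so that the center is the index times the step. For a mean-value axis that is lower bounded at zero, index 0 then covers only half the width of the other ranges.

```python
import math
from collections import Counter, defaultdict


def frequency_distribution_2(pairs, mean_step, diff_step):
    """Tabulate (B, F) pairs by mean value (F+B)/2 and difference F-B.

    Cells are keyed by (difference index, mean index); the center value
    of a range is index * step.
    """
    cells = Counter()
    diff_sum = defaultdict(float)
    for b, f in pairs:
        # Nearest range center, with ties assigned to the higher range.
        mean_idx = math.floor((f + b) / 2.0 / mean_step + 0.5)
        diff_idx = math.floor((f - b) / diff_step + 0.5)
        cells[(diff_idx, mean_idx)] += 1
        diff_sum[mean_idx] += f - b

    column_totals = Counter()  # (14) counts per mean-value range
    row_totals = Counter()     # (15) counts per difference range
    for (diff_idx, mean_idx), count in cells.items():
        column_totals[mean_idx] += count
        row_totals[diff_idx] += count

    # (16) mean difference within each mean-value range.
    mean_diffs = {m: diff_sum[m] / column_totals[m] for m in column_totals}
    return cells, column_totals, row_totals, mean_diffs
```

The two steps are independent, so a parameter such as sea-level pressure can be tabulated with coarse mean-value ranges and fine difference ranges.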

INTERPRETATION  OF  THE  TABLE:  For  the  ultimate  ideal,  all  pairs  fall 
into  cells  lying  along  the  diagonal  of  the  table;  all  cells  not  touching  the 
diagonal  would  be  empty.  This  perfection  requires  the  resolution  to  be 
so  fine  as  to  make  the  subscale  insignificant.  The  reports  must  be 
representative  of  the  applicable  time  span,  without  bias  or  variance.  The 
field  values  match  the  reports,  and  are  the  desired  objective  true  values. 

In  order  to  appreciate  what  happens  as  we  leave  the  ideal,  it  is
necessary  to  note  the  natural  frequency  distribution  of  the  object  parameter. 
For  example,  consider  the  distribution  of  the  marine  wind  in  equal  ranges 
of  wind  speed.  The  frequency  peaks  at  about  five  meters  per  second,  falling 
off  toward  zero  and  toward  higher  speed.  The  frequency  distribution  for 
wave  height  is  similar. 

Now  let  a  significant  subscale  enter  the  picture.  Each  population 
center  which  lies  on  the  diagonal  for  the  ideal  distribution  is  now  spread 
left  and  right,  within  each  row,  in  a  normal  distribution,  without  bias. 
Because  the  spread  is  symmetrical  about  the  diagonal  the  mean  value  for 
each  row  remains  unchanged:  the  mean  differences  (9)  remain  zero.
The  field  remains  true.  However,  the  mean  differences  for  the  columns  (8)
are  changed  because  the  natural  frequencies  of  occurrence  are  uneven.

In  ranges  above  the  peak  occurrence,  the  column-mean  field  values 
show  a  deficit  relative  to  the  observed  values  in  the  category  range;  below 
the  peak  a  surplus  is  shown.  This  result  can  be  entirely  attributed  to 
the  subscale  variance  and  the  natural  frequency  distribution  of  the  parameter. 
This  evidence  alone  is  insufficient  for  evaluating  the  field  production 
capability.  Earlier  versions  of  OPL2  labelled  these  differences  as  field
error;  they  should  not  be  interpreted  as  such.  These
mean  differences  (8)  occur  even  in  the  case  of  a  field  which  is  true.
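
The effect can be reproduced with a small simulation, sketched below under idealized assumptions that are not taken from the report: true wind speeds are drawn from a Rayleigh-type distribution peaking near 5 meters per second, the field equals the true value exactly, and each observation is the true value plus unbiased normal scatter representing the subscale. The function name and all parameter values are hypothetical.

```python
import random
from collections import defaultdict


def column_mean_differences(n=100_000, subscale_sd=2.0, bin_width=2.0, seed=0):
    """Mean of F - B by category of observed value B, for a true field.

    Returns a dict mapping the center of each observed-value range to the
    mean difference in that range (ranges with few reports are skipped).
    """
    rng = random.Random(seed)
    scale = 5.0 * 2.0 ** 0.5  # Weibull shape 2 (Rayleigh), mode near 5
    sums = defaultdict(float)
    counts = defaultdict(int)
    for _ in range(n):
        true_value = rng.weibullvariate(scale, 2.0)
        field = true_value  # the field is true
        observed = true_value + rng.gauss(0.0, subscale_sd)  # unbiased scatter
        category = int(observed // bin_width)
        sums[category] += field - observed
        counts[category] += 1
    return {c * bin_width + bin_width / 2.0: sums[c] / counts[c]
            for c in sorted(counts) if counts[c] >= 100}
```

In such a run the mean differences are positive in observed-value ranges below the peak and negative in ranges above it, although no error has been introduced into the field.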

Next,  introduce  variances  in  the  field  values  relative  to  the  true 
values,  but  without  bias.  Each  population  center  of  true  values  is 
symmetrically  spread  out  within  the  column  in  a  normal  fashion.  The  mean 
differences  (9)  for  categories  of  field  values  now  become  non-zero.  The 
field  values  show  a  deficit  relative  to  the  observed  values  in  ranges  of 
field  values  above  the  naturally  occurring  most  frequent  range,  and  a 
surplus  in  ranges  below  the  most  frequent.  It  is  wrong  to  interpret  these 
differences  as  field  biases  relative  to  true.  (Reread  the  first  sentence  of 
this  paragraph.)  It  would  be  damaging  to  the  field  to  adjust  the  field 
values  to  correspond  to  mean  observed  values.  While  such  adjustment 
would  improve  the  verification  of  the  field  relative  to  observed  values,  it 
would  not  improve  the  field  relative  to  the  desired  objective  true  values. 
The  resolution  of  extremals  in  the  distribution  would  especially  suffer. 
Restraint  from  such  actions  may  be  further  encouraged  by  remembering 
that  the  field  may  result  from  other  information,  not  represented  by  the 
verifying  reports. 

Clearly,  if  the  distribution  is  concentrated  along  the  diagonal  then 
the  field  must  be  good  in  the  sense  of  high  resolution  and  absolute 
accuracy.  Distributions  which  are  scattered  about  the  diagonal  should  be 
examined  analytically.  The  frequency  count  in  any  one  cell  is  the  result 
of  contributions  emanating  from  sources  lying  along  the  diagonal  in  the 
ideal  case.  Does  the  distribution  appear  to  be  consistent  with  the 
broadening  of  the  ideal  diagonal  concentration  by  the  addition  of 
variances  alone,  or  is  there  evidence  of  bias  in  portions  of  the  parameter 
range?  This  is  difficult  to  discern;  the  natural  frequency  distribution 
over  the  range  of  the  physical  parameter  complicates  this  analysis.  The 
frequency  peaks  for  an  unbiased  distribution  need  not  ridge  along  the 
diagonal.  A  capability  to  perform  such  an  analysis  objectively  can  be 
formulated  but  its  realization  will  require  a  considerable  amount  of 
ingenuity  and  work.

The  aforementioned  considerations  are  basic  to  the  interpretation  of
frequency  distribution  No.  1,  a  prerequisite  to  appreciation  of  frequency
distribution  No.  2.  This  second  table  is  more  flexible  in  that  the
tabulations  are  in  ranges  of  difference,  and  these  ranges  may  be 
incremented  differently  from  the  mean-value  ranges.  This  flexibility  is 
essential  for  verifying  fields  of  parameters  for  which  differences  between 
observed  and  field  values  are  generally  small  relative  to  the  physical 
range  of  the  parameter  (e.g.,  sea-level  pressure,  sea-surface  temperature). 
The  second  table  will  probably  be  favored  over  the  first  for  several 
reasons.  Once  understood  it  is  not  so  likely  to  mislead.  It  clearly 
reveals  the  correspondence  between  observed  and  field  values  including 
any  inherent  biases. 

Perhaps  the  most  useful  application  of  the  frequency  table  and 
related  statistics  is  for  comparison  of  different  fields  of  the  same  physical 
parameter,  and  similar  reports,  but  produced  by  different  capabilities 
and/or  with  different  resolutions  in  scale.  How  do  the  statistics  for  a 
predicted  field  compare  with  an  analysis  of  the  same  field?  How  do  the 
statistics  compare  for  a  coarse  grid  versus  a  finer  grid?  How  do  changes 
in  a  field  production  capability  alter  the  field  verification  statistics?  How 
do  the  statistics  compare  for  an  independent  sample  of  reports  versus  a 
dependent  sample? 

(17)  DAILY  RUN  INDEX.  Ordinarily  this  field  verification  statistics
package  is  to  be  run  once  per  day.  All  fields  generated  in  a
24-hour  period  are  available  for  selection.  Each  daily  run 
includes  an  index  of  the  fields  verified  in  the  run.  This 
index  appears  at  the  end  of  the  output. 


MULTI-DAY  SUMMARY:  Ordinarily  the  entire  output  of  daily  runs  is 
directed  to  microfiche  and  is  also  saved  in  a  circular  file  which  has  been 
set,  initially,  to  hold  7  days  of  runs  but  which  is  adjustable  to  hold  as 
many  as  31  days  of  runs.  The  User  may  request  the  production  of  a 
multi-day  summary  after  any  daily  run.  This  supplemental  program  has
been  designed  to  compile  the  most  recent  runs  held  in  the  circular  file; 
it  has  been  set,  initially,  to  compile  up  to  7  days  of  individual  runs. 
The  output  of  summary  runs  normally  goes  to  the  printer.  It  begins
with  a  listing  of  the  product  interpretation  guide. 

For  each  type  of  processed  field  held  in  the  circular  file  the  summary 
program  produces  frequency  distributions  No.  1  and  No.  2,  showing  the 
bivariate  distribution  which  results  from  combining  all  available  daily  runs. 
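
The behavior of the circular file and the summary program can be sketched as follows. The class and method names are hypothetical, and the sketch assumes that combining daily runs means adding the cell counts of their frequency tables, which the text implies but does not state.

```python
from collections import Counter, deque


class CircularFile:
    """Holds the most recent daily runs; the oldest run is dropped first."""

    MAX_DAYS = 31  # upper limit stated in the text

    def __init__(self, days=7):
        if not 1 <= days <= self.MAX_DAYS:
            raise ValueError("capacity must be between 1 and 31 days")
        self._runs = deque(maxlen=days)

    def save(self, daily_run):
        """Store one daily run: a dict mapping a field type to a table of
        cell counts (for example a Counter keyed by cell)."""
        self._runs.append(daily_run)

    def summary(self, days=7):
        """Combine up to `days` of the most recent runs, by field type.

        Returns the combined tables and the number of runs actually used.
        """
        if days < 1:
            raise ValueError("days must be at least 1")
        recent = list(self._runs)[-days:]
        combined = {}
        for run in recent:
            for field_type, table in run.items():
                # Counter.update adds counts cell by cell.
                combined.setdefault(field_type, Counter()).update(table)
        return combined, len(recent)
```

When fewer runs are held than were requested, the summary is compiled from those available and the number of runs used is returned with it.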

(18)  LISTING  OF  DAILY  RUNS.  The  frequency  distributions  No.  1 
and  No.  2  are  followed  by  a  list  of  the  individual  cases  of  that 
field  type  found  in  the  circular  file.  In  this  list  each  individual 
case  (i.e.,  single  field  of  the  type)  occupies  a  line  of  output 
which  includes  identification  and  statistical  measures  (8),  (9) 
and  (10)  drawn  from  the  daily  run  of  that  case. 

(19)  SUMMARY-RUN  INDEX.  This  index  appears  at  the  end  of  the 
Summary-run  output.  It  lists  in  order  of  appearance  the  types 
of  fields  which  have  been  separately  summarized. 

6.3.3  Example  of  Verification  Statistics — Daily  Run 

Table  6  shows  OPL2X  verification  statistics  resulting  from  comparison 
of  field  wind  directions  with  wind  directions  provided  by  concurrent 
observations.  The  field,  for  00Z  09  JAN  80,  was  produced  by  the  FNOC 
Planetary  Boundary  Layer  (PBL)  model  on  a  northern  hemisphere  63x63 
analysis  grid,  polar  stereographic  projection.  Corresponding  verification 
statistics  for  wind  speed  are  shown  in  Table  7.  In  both  tables,  numbers
in  parentheses  refer  to  those  given  in  the  Product  Interpretation  Guide 
(Section  6.3.2). 

6.3.4  Example  of  Verification  Statistics— Summary  Run 

Tables  8  and  9  are  similar  to  Tables  6  and  7  respectively  but  for  a 
00Z  summary  run.  Although  a  7-day  summary  was  specified,  for  this  run 
the  circular  file  contained  only  5  records — i.e.,  only  5  00Z  fields  and 
associated  00Z  observations  were  available  when  the  run  was  made.  The 
number  of  records  utilized  is  shown  at  the  top  of  the  tables.  As  before, 
the  numbers  in  parentheses  refer  to  those  given  in  the  Product 
Interpretation  Guide.